10th World Congress in Probability and Statistics

Organized Contributed Session (live Q&A at Track 1, 11:30 AM KST)

Organized 11

Random Growth, Spatial Processes and Related Models (Organizer: Erik Bates)

Conference: 11:30 AM — 12:00 PM KST
Local: Jul 21 Wed, 9:30 PM — 10:00 PM CDT

Holes in first-passage percolation

Wai-Kit Lam (University of Minnesota)

In first-passage percolation (FPP), one places i.i.d. nonnegative weights (t_e) on the nearest-neighbor edges of Z^d and studies the induced pseudometric. One of the main goals in FPP is to understand the geometry of the metric ball B(t), centered at the origin with radius t. When the weights (t_e) have a heavy-tailed distribution, it is known that B(t) contains many small holes. It is natural to ask whether large holes typically exist, and if so, how large they can be. In an ongoing project with M. Damron, J. Gold and X. Shen, we show that for any distribution with P(t_e = 0) < p_c, a.s. for all large t, the size of the largest hole is of order at least \log{t}, and the number of holes is of order at least t^{d-1}. If we assume that the limiting shape of B(t) is uniformly curved (which is unproven), we can also show that in two dimensions, a.s. the size of the largest hole is at most of order (\log{t})^C.
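
For readers less familiar with the model, the basic objects can be stated in one line (standard FPP notation, included here only for context): the passage time between x and y is T(x, y) = \inf_{\gamma} \sum_{e \in \gamma} t_e, where the infimum runs over nearest-neighbor paths \gamma from x to y, and B(t) = \{x \in Z^d : T(0, x) \le t\}. A hole of B(t) is then a bounded component of the complement Z^d \setminus B(t).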

Coalescence estimates for the corner growth model with exponential weights

Xiao Shen (University of Wisconsin-Madison)

We establish estimates for the coalescence time of semi-infinite directed geodesics in the planar corner growth model with i.i.d. exponential weights. There are four estimates: upper and lower bounds for both fast and slow coalescence on the correct scale with exponent 3/2. The lower bound for fast coalescence is new and has the optimal exponential order of magnitude. For the other three, we provide proofs that do not rely on integrable probability or on the connection with the totally asymmetric simple exclusion process, in order to provide a template for the extension to other models. We utilize a geodesic duality introduced by Pimentel and properties of the increment-stationary last-passage percolation process.
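
As background (standard notation, not specific to this talk), the corner growth model with exponential weights is the last-passage percolation process G(x) = \max_{\pi} \sum_{v \in \pi} \omega_v, where the maximum runs over up-right lattice paths \pi from the origin to x and the \omega_v are i.i.d. exponential weights. A semi-infinite geodesic is an infinite up-right path all of whose finite segments are maximizing, and coalescence of two such geodesics (with the same asymptotic direction) means that they eventually agree.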

Scaling limits of sandpiles

Ahmed Bou-Rabee (University of Chicago)

The Abelian sandpile is a diffusion process on the integer lattice which produces striking, kaleidoscopic patterns. I will discuss recent progress towards understanding these patterns and their stability under randomness.
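
To make the toppling dynamics concrete, here is a minimal Python sketch of the Abelian sandpile stabilization rule on a finite box (a toy illustration assumed for this summary, not code from the talk): any site holding at least 4 chips topples, sending one chip to each of its four lattice neighbors, and chips that leave the box are lost.

import numpy as np

def stabilize(chips):
    """Topple until stable: each toppling of a site with >= 4 chips sends one
    chip to each of its four neighbors; chips pushed past the boundary are lost.
    The final configuration is independent of the toppling order (abelian)."""
    chips = chips.astype(int)
    while (chips >= 4).any():
        t = chips // 4                  # number of topplings per site in this sweep
        chips -= 4 * t
        chips[1:, :] += t[:-1, :]       # chips sent downward
        chips[:-1, :] += t[1:, :]       # chips sent upward
        chips[:, 1:] += t[:, :-1]       # chips sent rightward
        chips[:, :-1] += t[:, 1:]       # chips sent leftward
    return chips

# Example: drop 2**12 chips at the center of a 101 x 101 box and stabilize.
n = 101
pile = np.zeros((n, n), dtype=int)
pile[n // 2, n // 2] = 2**12
final = stabilize(pile)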

Q&A for Organized Contributed Session 11

This talk does not have an abstract.

Session Chair

Erik Bates (University of California Berkeley)

Organized 19

Recent Advances in Complex Data Analysis (Organizer: Seung Jun Shin)

Conference: 11:30 AM — 12:00 PM KST
Local: Jul 21 Wed, 9:30 PM — 10:00 PM CDT

Kernel density estimation and deconvolution under radial symmetry

Kwan-Young Bak (Korea University)

This study illustrates a dimensionality reduction effect of radial symmetry in nonparametric density estimation. To deal with the class of radially symmetric functions, we adopt a generalized translation operation that preserves the symmetry structure. Radial kernel density estimators based on directly or indirectly observed random samples are proposed. For the latter case, we analyze deconvolution problems under four distinct scenarios depending on the symmetry assumptions on the signal and noise. Minimax upper and lower bounds are established for each scheme to investigate the role of radial symmetry in determining optimal rates of convergence. The results confirm that radial symmetry reduces the dimension of the estimation problems, so that the optimal rate of convergence coincides with the univariate convergence rate except at the origin, where a singularity occurs. The results also imply that the proposed estimators are rate optimal in the minimax sense for the Sobolev class of densities.
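
For context, the estimators studied here modify the standard multivariate kernel density estimator \hat{f}_h(x) = \frac{1}{n h^d} \sum_{i=1}^{n} K((x - X_i)/h), replacing the Euclidean shift x - X_i by a generalized translation that preserves radial symmetry; the precise operator and the resulting estimators are defined in the work itself.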

Penalized polygram regression for bivariate smoothing

Jae-Hwan Jhong (Chungbuk National University)

We consider the problem of estimating a bivariate function over the plane using triangulation and penalization techniques. To provide a spatially adaptive method, a total variation penalty for the bivariate spline function is used to remove unnecessary common edges from the initial triangulation. We introduce a coordinate descent algorithm that efficiently solves the resulting convex optimization problem with the total variation penalty. The proposed estimator, called Penalized Polygram Regression (PPR), is piecewise linear and continuous over adjacent polygons (not limited to triangles), and its corresponding basis functions are obtained as the coordinate descent algorithm proceeds to eliminate common edges. Numerical studies using both simulated and real data examples are provided to illustrate the performance of the proposed method.
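
Schematically (in notation assumed here for illustration, not taken from the paper), the estimator solves a penalized least-squares problem of the form \min_f \sum_{i=1}^{n} (y_i - f(x_i))^2 + \lambda \, TV(f) over continuous piecewise-linear functions f on the initial triangulation, where the total variation penalty TV(f) shrinks the change of slope across common edges toward zero, so that uninformative edges are removed and triangles merge into larger polygons.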

Penalized logistic regression using functional connectivity as covariates with an application to mild cognitive impairment

Eunjee Lee (Chungnam National University)

There is an emerging interest in brain functional connectivity (FC) based on functional Magnetic Resonance Imaging in Alzheimer’s disease (AD) studies. The complex and high-dimensional structure of FC makes it challenging to explore the association between altered connectivity and AD susceptibility. We develop a pipeline to refine FC as proper covariates in a penalized logistic regression model and classify normal and AD susceptible groups. Three different quantification methods are proposed for FC refinement. One of the methods is dimension reduction based on common component analysis (CCA), which is employed to address the limitations of the other methods. We applied the proposed pipeline to the Alzheimer’s Disease Neuroimaging Initiative (ADNI) data and deduced pathogenic FC biomarkers associated with AD susceptibility. The refined FC biomarkers were related to brain regions for cognition, stimuli processing, and sensorimotor skills. We also demonstrated that a model using CCA performed better than others in terms of classification performance and goodness-of-fit.
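
At its core, the classification step described above is a penalized logistic regression with the refined FC features as covariates. The following is a minimal scikit-learn sketch with synthetic data; the feature matrix, l1 penalty and tuning below are illustrative assumptions, not the pipeline from the paper.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical data: rows are subjects, columns are refined FC covariates.
rng = np.random.default_rng(0)
X = rng.standard_normal((120, 300))
y = rng.integers(0, 2, size=120)   # 0 = cognitively normal, 1 = AD-susceptible

# Sparse (l1-penalized) logistic regression, evaluated by cross-validated AUC.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1, max_iter=1000)
print(cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean())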

Resmax: detecting voice spoofing attacks with residual network and max filter map

Il-Youp Kwak (Chung-Ang University)

The 2019 automatic speaker verification spoofing and countermeasures challenge (ASVspoof) competition aims to facilitate the design of highly accurate voice spoofing attack detection systems. However, it does not emphasize model complexity and latency requirements, even though such constraints are strict and integral in real-world deployment. Hence, most of the top-performing solutions from the competition use an ensemble approach and combine multiple complex deep learning models to maximize detection accuracy. This kind of approach would sit uneasily with real-world deployment constraints. To design a lightweight system, we combine the notions of skip connection (from ResNet) and max filter map (from Light CNN) and evaluate the resulting model's accuracy on the ASVspoof 2019 dataset. By optimizing a well-known signal processing feature called the constant Q transform (CQT), our single model achieved a spoofing attack detection equal error rate (EER) of 0.16%, outperforming the top ensemble system from the competition, which achieved an EER of 0.39%.
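
To make the two building blocks concrete, here is a small NumPy illustration of the max filter map (max feature map) operation from Light CNN and an additive skip connection; this is a toy sketch of the two ideas, not the ResMax architecture itself.

import numpy as np

def max_filter_map(x):
    """Light-CNN-style max feature map: split the channel dimension into two
    halves and keep the elementwise maximum, halving the number of channels."""
    c = x.shape[0] // 2
    return np.maximum(x[:c], x[c:])

def residual_step(x, transform):
    """ResNet-style skip connection: output = input + transform(input)."""
    return x + transform(x)

feat = np.random.randn(8, 32, 32)                    # 8-channel toy feature map
compressed = max_filter_map(feat)                    # -> 4 channels
out = residual_step(compressed, lambda z: 0.1 * z)   # stand-in for conv layers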

Weighted validation of heteroscedastic regression models for better selection

Yoonsuh Jung (Korea University)

Statistical modeling can be divided into two processes: model fitting and model selection for the given task. For model fitting, it is vital to select the appropriate type of model to use, and this step is taken first. For model selection, the model is fine-tuned via variable and parameter selection. Improving model selection in the presence of heteroscedasticity is the main goal of this talk. Model selection is usually conducted by measuring the prediction error. When there is heteroscedasticity in the data, observations with high variation tend to produce larger prediction errors; in turn, model selection is strongly affected by observations with large variation. To reduce the effect of heteroscedastic data, we propose weighted selection during the model selection process. The proposed method reduces the impact of large prediction errors via weighted prediction and leads to better model and parameter selection. The benefits of the proposed method are demonstrated in simulations and with two real data sets.
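
The idea can be sketched in a few lines: downweight validation residuals from high-variance observations so that model comparison is not dominated by them. The inverse-variance weights below are an assumption made for illustration; the specific weighting scheme proposed in the talk may differ.

import numpy as np

def weighted_validation_error(y_true, y_pred, variance_est):
    """Weighted prediction error with weights inversely proportional to the
    estimated conditional variance of each validation observation."""
    w = 1.0 / np.asarray(variance_est, dtype=float)
    w = w / w.sum()
    return np.sum(w * (np.asarray(y_true) - np.asarray(y_pred)) ** 2)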

Q&A for Organized Contributed Session 19

This talk does not have an abstract.

Session Chair

Seung Jun Shin (Korea University)

Organized 25

Recent Advances in Biostatistics (Organizer: Sangwook Kang)

Conference: 11:30 AM — 12:00 PM KST
Local: Jul 21 Wed, 9:30 PM — 10:00 PM CDT

Bayesian nonparametric adjustment of confounding

Chanmin Kim (Sungkyunkwan University)

In observational studies, confounder selection is a crucial task in estimating the causal effect of an exposure. Wang et al. (2012, 2015) propose Bayesian adjustment for confounding (BAC) methods to account for the uncertainty in confounder selection by jointly fitting parametric models for exposure and outcome, in which Bayesian model averaging (BMA) is utilized to obtain the causal effect averaged across all potential models according to their posterior weights. In this work, we propose a Bayesian nonparametric approach to select confounders and estimate causal effects without assuming any model structures for exposure and outcome. With the Bayesian additive regression trees (BART) method, the causal model can capture complex data structures flexibly and select a subset of true confounders by specifying a common prior on the selection probabilities in both the exposure and outcome models. The proposed model does not require a separate BMA process to average effects across many models because, in our method, the selection of confounders and the estimation of causal effects based on the selected confounders are carried out simultaneously within each MCMC iteration. A set of extensive simulation studies demonstrates that the proposed method outperforms existing approaches in a variety of situations.
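
Schematically (notation assumed here for illustration), with exposure A, outcome Y and candidate confounders X_1, ..., X_p, the exposure model A ~ BART(X) and the outcome model Y ~ BART(A, X) draw their splitting variables from a common set of selection probabilities (s_1, ..., s_p); a covariate favored by one model is therefore more likely to be used by the other within the same MCMC iteration, and posterior summaries of these shared probabilities indicate which covariates act as confounders.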

Multivariate point process models for microbiome image analysis

Kyu Ha Lee (Harvard University)

We investigate the spatial distribution of microbes to understand the role of biofilms in human and environmental health. Advances in spectral imaging technologies enable us to display how different taxa (e.g. species or genera) are located relative to one another and to host cells. However, most commonly used quantitative methods are limited to describing spatial patterns of bivariate data. Therefore, we propose a flexible multivariate spatial point process model that can quantify spatial relationships among the multiple taxa observable in biofilm images. We have developed an efficient computational scheme based on the Hamiltonian Monte Carlo algorithm, implemented in an R package. We applied the proposed model to tongue biofilm image data.

Look before you leap: systematic evaluation of tree-based statistical methods in subgroup identification

Xiaojing Wang (University of Connecticut)

Subgroup analysis, as the key component of personalized medicine development, has attracted a lot of interest in recent years. While a number of exploratory subgroup searching approaches have been proposed, informative evaluation criteria and scenario-based systematic comparison of these methods are still underdeveloped topics. In this article, we propose two evaluation criteria in connection with traditional type I error and power concepts, and another criterion to directly assess recovery performance of the underlying treatment effect structure. Extensive simulation studies are carried out to investigate empirical performance of a variety of tree-based exploratory subgroup methods under the proposed criteria. A real data application is also included to illustrate the necessity and importance of method evaluation.

Q&A for Organized Contributed Session 25

This talk does not have an abstract.

Session Chair

Sangwook Kang (Yonsei University)

Organized 31

BOK Contributed Session: Finance and Contemporary Issues (Organizer: BOK Economic Statistics Department)

Conference: 11:30 AM — 12:00 PM KST
Local: Jul 21 Wed, 9:30 PM — 10:00 PM CDT

Multi-step reflection principle and barrier options

Seongjoo Song (Korea University)

This paper examines a class of barrier options, called multi-step barrier options, which can have any finite number of barriers of any level. We obtain a general, explicit expression for the prices of such options under the Black-Scholes model. Multi-step barrier options are useful not only because they can handle barriers of different levels and time steps, but also because they can approximate options with arbitrary barriers. Moreover, they can be embedded in financial products such as deposit insurances based on jump models with simple barriers. Along the way, we derive a multi-step reflection principle, which generalizes the reflection principle of Brownian motion.
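
For reference, the classical single-barrier reflection principle that is generalized here states that for a standard Brownian motion W, a barrier a > 0 and a level b \le a, P(\max_{0 \le s \le t} W_s \ge a, W_t \le b) = P(W_t \ge 2a - b); the multi-step version iterates this identity across barriers whose levels change at prescribed time steps.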

Change point analysis in Bitcoin return series: a robust approach

Junmo Song (Kyungpook National University)

Over the last decade, Bitcoin has attracted a great deal of public interest, and along with this, the Bitcoin market has grown rapidly. Its speculative price movements have also drawn the interest of many researchers as well as financial investors. Accordingly, numerous studies have been devoted to the analysis of Bitcoin, more precisely to the volatility modelling of Bitcoin returns. In this study, we are interested in change point analysis of Bitcoin return data. Since Bitcoin returns contain some outlying observations that can affect statistical inference undesirably, we use a robust test for parameter change to locate significant change points. We report some change points that are not detected by the existing tests and demonstrate that the model with parameter changes is better fitted to the data. Finally, we show that the model incorporating parameter change can improve the forecasting performance of Value-at-Risk.

A self-normalization test for correlation matrix change

Ji Eun Choi (Pukyong National University)

We construct a new test for correlation matrix breaks based on the self-normalization method. The self-normalization test has practical advantages over the existing test: easy and stable implementation; no singularity issue or bandwidth selection issue; and a remedy for the size distortion of the existing test under (near) singularity, serial dependence, conditional heteroscedasticity or unconditional heteroscedasticity. These advantages are demonstrated experimentally by a Monte Carlo simulation and theoretically by showing that no estimation of the complicated covariance matrix of the sample correlations is needed. We establish the asymptotic null distribution and consistency of the self-normalization test. We apply the correlation matrix break tests to the stock log returns of the 10 largest-weight companies in the NASDAQ 100 index and to five volatility indexes for options on individual equities.

Volatility as a risk measure of financial time series: high frequency and realized volatility

Sun Young Hwang (Sookmyung Women's University)

Volatility, as a risk measure, is defined as a time-varying variance process of the return on an asset. GARCH models have been useful for modelling the volatilities of various financial time series. This talk reviews standard volatility computations via GARCH models and then discusses recent issues such as multivariate volatility, realized volatility and high frequency volatility of financial time series. To illustrate, applications to various Korean financial time series are presented.
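
For concreteness, two standard definitions contrasted in the talk are the GARCH(1,1) volatility process, r_t = \sigma_t \varepsilon_t with \sigma_t^2 = \omega + \alpha r_{t-1}^2 + \beta \sigma_{t-1}^2, and the realized volatility of day t computed from M intraday returns r_{t,i}, RV_t = \sum_{i=1}^{M} r_{t,i}^2.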

Q&A for Organized Contributed Session 31

This talk does not have an abstract.

Session Chair

Changryong Baek (Sungkyunkwan University)
